ICWAPR 2023
Engineering Failure Analysis 2023
Macromolecular Rapid Communications 2023
We propose an end-to-end model that uses attention mechanisms to fuse a single RGB image with its defocus map for depth estimation. Instead of handcrafted fusion, it relies on self- and cross-attention for uncertainty-aware refinement.
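As a rough illustration of cross-attention fusion, the sketch below lets RGB feature tokens query defocus-map feature tokens with plain scaled dot-product attention. It is a minimal sketch under stated assumptions, not the model from the paper: the names `cross_attention`, `rgb_tokens`, and `defocus_tokens` are hypothetical, the toy feature values are made up, and the learned projections, multiple heads, and uncertainty-aware refinement are all omitted.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query attends over all keys.

    Assumption: no learned Q/K/V projections; tokens are used directly.
    Returns the fused tokens and the attention weights per query.
    """
    d = len(queries[0])
    fused, weights = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of value vectors, one output channel at a time.
        fused.append([sum(wj * v[c] for wj, v in zip(w, values))
                      for c in range(len(values[0]))])
    return fused, weights


# Toy (hypothetical) features: 2 RGB tokens query 3 defocus tokens.
rgb_tokens = [[1.0, 0.0], [0.0, 1.0]]
defocus_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused, weights = cross_attention(rgb_tokens, defocus_tokens, defocus_tokens)
```

Each row of `weights` sums to 1, so every fused token is a convex combination of the defocus tokens, weighted by similarity to the RGB query.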
We used a generative model with DDIM sampling to synthesize additional training images from small datasets, improving few-shot learning. Interpolation resolved image-size mismatches, and noise reduction further improved the results.
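For context, a single deterministic DDIM update (the eta = 0 case) can be sketched as below. This is a minimal sketch under stated assumptions, not the pipeline used in the paper: `ddim_step` and the toy values are hypothetical, images are flattened to lists of floats, and the noise prediction `eps_pred` would in practice come from a trained network rather than being supplied by hand.

```python
import math


def ddim_step(x_t, eps_pred, alpha_bar_t, alpha_bar_prev):
    """One deterministic DDIM update (eta = 0).

    x_t:            current noisy sample (list of floats)
    eps_pred:       predicted noise for x_t (assumed to come from a model)
    alpha_bar_t:    cumulative noise-schedule product at the current step
    alpha_bar_prev: cumulative product at the previous (less noisy) step
    """
    # Estimate the clean sample implied by the predicted noise.
    x0_pred = [(x - math.sqrt(1.0 - alpha_bar_t) * e) / math.sqrt(alpha_bar_t)
               for x, e in zip(x_t, eps_pred)]
    # Re-noise the estimate to the previous step's noise level.
    return [math.sqrt(alpha_bar_prev) * x0 + math.sqrt(1.0 - alpha_bar_prev) * e
            for x0, e in zip(x0_pred, eps_pred)]


# Toy check (hypothetical values): build x_t from a known x0 and noise.
x0 = [0.5, -1.0]
eps = [0.2, 0.1]
a_t, a_prev = 0.5, 0.9
x_t = [math.sqrt(a_t) * a + math.sqrt(1.0 - a_t) * e for a, e in zip(x0, eps)]
x_prev = ddim_step(x_t, eps, a_t, a_prev)
```

When the noise prediction is exact, as in this toy check, the step lands on the same `x0` and `eps` at the lower noise level, and setting `alpha_bar_prev` to 1.0 recovers `x0` itself.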